Visualizations

PCA

Unique SNPs per Parent Term:

Cancer | Cardiovascular disease | Digestive System Disorder | Immune System Disorder | Metabolic Disorder | Neurological Disorder | Other Disease | Total
66 | 51 | 75 | 74 | 18 | 162 | 105 | 477
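The total (477) is smaller than the sum of the per-term counts (551), presumably because a SNP can fall under more than one parent term. A minimal sketch of that counting, using hypothetical toy rows and a hypothetical column name `Parent_term` rather than the project's real data:

```r
# Toy data: hypothetical rows, not the project's real association table.
assoc <- data.frame(
  VariantID   = c("rsA", "rsA", "rsB", "rsC", "rsC"),
  Parent_term = c("Cancer", "Other disease", "Cancer", "Cancer", "Cancer"),
  stringsAsFactors = FALSE
)

# Drop repeated (SNP, parent term) pairs so each SNP counts once per term.
snp_term <- unique(assoc[, c("VariantID", "Parent_term")])

# Unique SNPs per parent term.
per_term <- table(snp_term$Parent_term)

# Unique SNPs overall; a SNP shared by two terms is counted only once here.
total_unique <- length(unique(assoc$VariantID))

print(per_term)
print(total_unique)
```

In this toy example the per-term counts sum to 4 while the overall count is 3, because `rsA` appears under two parent terms.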

Number of SNPs above Fst Thresholds per Sub Population

All Diseases:

Monogenic Diseases:

Prevalence Vs Mean Allele Frequency

CRITICAL NOTE ON THESE PLOTS:

Unfortunately, I could not configure these plots as intended. For reasons I could not determine, the row order could not be changed, and time constraints kept me from resolving the issue before generating the plots. In addition, the GBR sub population in the “mean-allele-frequency” bar graphs is evidently incorrect: an error in preprocessing affected this sub population, and it is visible in these graphs. The GBR population comparison here should therefore be ignored.

Correlation Table

Bar Plots:

Top 20 SNPs (by Fst) Per Super Population

Admixed American

African

East Asian

South Asian

European

Tables

NOTE: All tables are shown only partially here. The complete tables can be downloaded from the “tables_(csv)” folder.

Top 20 Disease SNPs per Sub Population

VariantID | MAPPED_GENE | EnsVar_most_severe_consequence | DISEASE/TRAIT | MAPPED_TRAIT | Parent term | population_labels | Fst_per_population
rs10032909 | LINC01088 | intron_variant | Systemic lupus erythematosus | obsolete_systemic lupus erythematosus | Immune system disorder | EUR | 0.3539
rs10035291 | SSBP2 | intron_variant | Bipolar disorder | obsolete_bipolar disorder | Neurological disorder | LWK | 0.5033
rs1012621 | TGM3 | intron_variant | Inflammatory skin disease | atopic eczema, psoriasis | Other disease | ASW | 0.3138
rs1012621 | TGM3 | intron_variant | Inflammatory skin disease | atopic eczema, psoriasis | Immune system disorder | ASW | 0.3138
rs1016189 | MAGI2 | intron_variant | Acute graft-versus-host disease (gut) (recipient effect) | acute graft vs. host disease | Immune system disorder | FIN | 0.3603
rs10245867 | JAZF1 | intron_variant | Multiple sclerosis | multiple sclerosis | Immune system disorder | CHS | 0.4332
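A ranking like this can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code that produced the table: the data are hypothetical toy values, and the column names `population` and `Fst` as well as the helper `top_n_by_fst` are made up for the example.

```r
# Toy data: hypothetical SNP IDs and Fst values, for illustration only.
fst <- data.frame(
  VariantID  = c("rs1", "rs2", "rs3", "rs4", "rs5", "rs6"),
  population = c("EUR", "EUR", "EUR", "LWK", "LWK", "LWK"),
  Fst        = c(0.10, 0.35, 0.20, 0.50, 0.05, 0.30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the n rows with the highest Fst
# within each population.
top_n_by_fst <- function(df, n) {
  groups <- split(df, df$population)
  top <- lapply(groups, function(g) {
    g <- g[order(g$Fst, decreasing = TRUE), ]
    head(g, n)
  })
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

top2 <- top_n_by_fst(fst, n = 2)
print(top2)
```

Calling `top_n_by_fst(fst, n = 20)` on a full table would give the top 20 rows per population. Populations with fewer than `n` SNPs simply return all of their rows.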

Top 10 Monogenic Disease SNPs per Sub Population

VariantID | MAPPED_GENE | EnsVar_most_severe_consequence | DISEASE/TRAIT | MAPPED_TRAIT | Parent term | population_labels_mono | Fst_per_population
rs10031265 | PKD2 | intron_variant | Serum creatinine levels | creatinine measurement | Other measurement | CEU|CLM|ESN|GBR | 0.2316|0.1771|0.3568|0.2268
rs10151945 | ATXN3 | 3_prime_UTR_variant | Lung function (FVC) | vital capacity | Other measurement | AMR|PEL | 0.1065|0.3051
rs10154834 | GLB1 | intron_variant | Medication use (glucocorticoids) | Glucocorticoid use measurement | Other measurement | CLM|ITU | 0.2468|0.192
rs10533065 | CASR | intron_variant | Eosinophil counts | eosinophil count | Hematological measurement | ASW | 0.2245
rs10744953 | FBN1 | intron_variant | Ascending aorta minimum area | aortic measurement | Cardiovascular measurement | EUR|PJL|PUR | 0.2747|0.1734|0.1734
rs10744953 | FBN1 | intron_variant | Ascending aorta minimum area | aortic measurement | Cardiovascular measurement | EUR|PJL|PUR | 0.2747|0.1734|0.1734
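In this table the `population_labels_mono` and `Fst_per_population` columns pack several values into one cell, separated by `|`. The sketch below expands such cells into one row per SNP and population. It assumes that the i-th label pairs with the i-th Fst value in the same row, which the table layout suggests but does not state. The output column names `population` and `Fst` are hypothetical.

```r
# Two rows copied from the table above (only the columns needed here).
mono <- data.frame(
  VariantID              = c("rs10151945", "rs10154834"),
  population_labels_mono = c("AMR|PEL", "CLM|ITU"),
  Fst_per_population     = c("0.1065|0.3051", "0.2468|0.192"),
  stringsAsFactors = FALSE
)

# Split on a literal "|" (fixed = TRUE, because "|" is a regex metacharacter).
pops <- strsplit(mono$population_labels_mono, "|", fixed = TRUE)
fsts <- strsplit(mono$Fst_per_population, "|", fixed = TRUE)

# Assumption: labels and Fst values pair up by position, so every row
# must have as many labels as Fst values.
stopifnot(lengths(pops) == lengths(fsts))

# One output row per (SNP, population) pair.
mono_long <- data.frame(
  VariantID  = rep(mono$VariantID, times = lengths(pops)),
  population = unlist(pops),
  Fst        = as.numeric(unlist(fsts)),
  stringsAsFactors = FALSE
)
print(mono_long)
```

The long format makes it straightforward to filter or sort by a single population, which is awkward while the values are packed into delimited strings.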

Complete Set of Monogenic Disease Associated Variants

VariantID | DISEASE/TRAIT | MAPPED_GENE | CONTEXT | INTERGENIC | RISK ALLELE FREQUENCY | MAPPED_TRAIT | EnsVar_most_severe_consequence | Parent term | monogenic_searched_resource | monogenic_disease_name
rs1078793 | Age at loss of ambulation in Duchenne muscular dystrophy | NA - RAMP3 | regulatory_region_variant | 1 | NR | Duchenne muscular dystrophy, disease progression measurement | regulatory_region_variant | Cardiovascular disease | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT
rs1078793 | Age at loss of ambulation in Duchenne muscular dystrophy | NA - RAMP3 | regulatory_region_variant | 1 | NR | Duchenne muscular dystrophy, disease progression measurement | regulatory_region_variant | Other measurement | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT
rs11017928 | Age at loss of ambulation in Duchenne muscular dystrophy | DOCK1 | intron_variant | 0 | NR | Duchenne muscular dystrophy, disease progression measurement | intron_variant | Cardiovascular disease | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT
rs11017928 | Age at loss of ambulation in Duchenne muscular dystrophy | DOCK1 | intron_variant | 0 | NR | Duchenne muscular dystrophy, disease progression measurement | intron_variant | Other measurement | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT
rs11641605 | Age at loss of ambulation in Duchenne muscular dystrophy | LINC02141 | intron_variant | 0 | NR | Duchenne muscular dystrophy, disease progression measurement | intron_variant | Cardiovascular disease | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT
rs11641605 | Age at loss of ambulation in Duchenne muscular dystrophy | LINC02141 | intron_variant | 0 | NR | Duchenne muscular dystrophy, disease progression measurement | intron_variant | Other measurement | A. Nesterova, Monogenic rare diseases in biomedical databases and text mining | SEE DISEASE/TRAIT

R Version & Packages Used

Version:

platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
crt            ucrt
system         x86_64, mingw32
status
major          4
minor          2.1
year           2022
month          06
day            23
svn rev        82513
language       R
version.string R version 4.2.1 (2022-06-23 ucrt)
nickname       Funny-Looking Kid

Packages:

Package              Version
curl                 4.3.2
data.table           1.14.2
dplyr                1.0.10
esquisse             1.1.2
ggplot2              3.3.6
ggrepel              0.9.3
httr                 1.4.4
jsonlite             1.8.0
knitr                1.40
purrr                0.3.4
tibble               3.1.8
tidyr                1.2.1
GWASpops.pheno2geno  0.900
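
To compare a local setup against the versions listed above, something like the following sketch may help. It uses only base R and `utils`. The helper name `installed_version` is hypothetical, and packages that are not installed are reported as `NA` rather than raising an error.

```r
# Packages listed above.
pkgs <- c("curl", "data.table", "dplyr", "esquisse", "ggplot2", "ggrepel",
          "httr", "jsonlite", "knitr", "purrr", "tibble", "tidyr",
          "GWASpops.pheno2geno")

# Hypothetical helper: installed version of a package as a string,
# or NA if the package is not installed. system.file() returns ""
# for a missing package, so nothing is loaded or attached here.
installed_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- data.frame(
  Package = pkgs,
  Version = vapply(pkgs, installed_version, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

print(R.version.string)
print(versions)
```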